Abstract

The purpose of this analysis was to assess whether effects of first-grade mathematics intervention 
apply across the range of at-risk learners’ initial skill levels. Students were randomly assigned to 
control (n = 213) and two variants of intervention (n = 395) designed to improve arithmetic. 
Of each 30-minute intervention session (48 over 16 weeks), 25 minutes were identical in the 
two variants, focused on number knowledge that provides the conceptual bases for arithmetic. 
The other five minutes provided nonspeeded conceptual practice (n = 196) or speeded 
strategic practice (n = 199). Contrasts tested effects of intervention (combined across variants) 
versus control and effects between the variants. Moderation analysis indicated no significant 
interactions between at-risk children’s pre-intervention mathematics skill and either contrast 
on any outcome. Across pre-intervention math skill, effects favored intervention over control 
on arithmetic and transfer to double-digit calculations and number knowledge, and favored 
speeded over nonspeeded practice on arithmetic. 


When a randomized controlled trial (RCT) pro- 
duces statistically significant effects favoring 
the learning outcomes of students who receive 
intervention over those who do not, that inter- 
vention is deemed validated. Validation sug- 
gests most students respond to the intervention, 
but few, if any, standard (non-individualized) 
interventions achieve universal response. 
Some students require adjustments to make 
intervention more intensive (O’Connor & 
Fuchs, 2013). 

Little is known about student characteris- 
tics that explain responsiveness. One possi- 
bility is that the robustness of intervention 
effects depends on the level of students’ pre- 
intervention academic skill. We identified 
two previous studies that assessed the efficacy 
of mathematics intervention as a function of 
students’ pre-intervention math performance. 
In Smith, Cobb, Farran, Cordray, and Munter 
(2013), first graders who received Math 
Recovery outperformed the control group on 


arithmetic, concepts and applications, quanti- 
tative concepts, and math reasoning. Effect 
sizes (ESs) ranged from 0.15 to 0.30, but 
were larger for children who began interven- 
tion below the 25th percentile (ESs = 0.31- 
0.40) than for students who exceeded the 
25th percentile. This suggests students with 
greater math competence had less need for 
Math Recovery. Yet two-thirds of Smith 
et al.’s “at-risk” sample began intervention 
scoring above the 25th percentile, and scores 
ranged as high as the 95th percentile. 

By contrast, in a sample selected to represent 
the distribution of at-risk learners’ low mathe- 
matics skill (1st-34th percentile, sampled pur- 
posely to represent the full distribution), Fuchs, 
Sterba, Fuchs, and Malone (2016) tested 
whether pre-intervention math performance 
moderated the effects of a fourth-grade frac- 
tions intervention. No moderation effect was 
identified: Students benefited comparably 
from the intervention, with similar magnitude 
of effects for at-risk intervention students over 
at-risk control group students across points 
along pre-intervention whole-number math 
skill. Although this suggests the robustness of 
intervention, it may indicate that pre-interven- 
tion mathematics performance is not a viable 
basis for forecasting which students will 
respond inadequately and are in need of more 
intensive intervention. 

Yet, fractions at fourth grade have some 
features that may not generalize to other 
mathematics topics. For example, the princi- 
ples that guide whole-number understanding 
(used to index pre-intervention math skill at 
the start of the Fuchs et al., 2016 intervention) 
differ from those guiding rational-number 
thinking (used to index intervention outcome), 
and the whole-number calculations involved 
in fourth-grade fractions are relatively simple. 
Thus, in the initial phases of fractions learn- 
ing, students do not require strong whole- 
number knowledge and operational skill to 
succeed. Clearly, additional research is needed 
to examine at-risk pre-intervention math skill 
as a moderator of intervention effects in which 
the moderator represents a critical founda- 
tional skill for the math learning addressed 
during intervention. 


Purpose of the Present 
Analysis 


The purpose of the present analysis was to 
revisit whether pre-intervention mathematics 
performance moderates intervention efficacy 
using a first-grade whole-number intervention 
designed to improve children’s arithmetic 
(Fuchs et al., 2013). Pre-intervention whole- 
number understanding and operational skill 
(the moderator variable) is important for suc- 
cess with arithmetic, and a good distribution 
of performance on whole-number understand- 
ing and skill exists at the start of first grade. 
We selected the Fuchs et al. (2013) study for 


moderation analysis for this and three addi- 
tional reasons. 

The second reason was that first-grade 
arithmetic skill is a critical target. It predicts 
mathematics learning through the end of fifth 
grade (Geary, 2011) and eventual mastery 
of high school algebra, a gateway for later 
entry into mathematics-intensive fields (U.S. 
Department of Education, 2008). Third, the 
screening criteria for entering the study were 
designed to create a viable distribution of pre- 
intervention scores at the at-risk end of math 
performance. Fourth, pre- and postinterven- 
tion normative data were available on the same 
pre- and post-intervention measures for not-at- 
risk classmates. This permitted us not only to 
conduct moderation analysis, in which the performance 
of intervention students is contrasted with that of 
at-risk control group students, but also to consider 
rates of responsiveness to intervention by evaluating 
at-risk students’ growth and end-of-intervention 
performance against not-at-risk classmates on the 
same measures. 

Results may provide insight into how pre- 
intervention math performance affects respon- 
siveness to generally effective intervention, 
with implications for making intervention 
decisions more efficient and effective. For 
example, if results show that intervention fails 
to serve the needs of students with more 
severe pre-intervention math deficits, schools 
might place these students in intensive inter- 
vention immediately, before they experience 
failure in a standard intervention. Alternatively, 
if results show that intervention fails to 
serve the needs of students with less severe 
pre-intervention math deficits, schools might 
forgo placing these students in intervention 
and instead focus on adjustments to the gen- 
eral education program. 


Information on First-Grade 
Math Development and Our 
Approach to Intervention 


Before describing study methods and our ana- 
lytic procedures, we contextualize the Fuchs 
et al. (2013) study by describing how compe- 
tence with first-grade arithmetic develops and 
how our approach to intervention reflects the 
developmental pathways. When children 
enter first grade, most have a rudimentary 
understanding of addition and subtraction and 
can count to solve problems (Geary, 1994). 
For addition, young children count both 
addends; for subtraction, they represent the 
beginning quantity with objects, sequentially 
separate the number of objects to be sub- 
tracted, and then count the remaining set 
(Groen & Resnick, 1977). As understanding of 
cardinality and the counting sequence develops, 
children discover the number-after rule for 
adding with 1. They also come to understand 
that the sum of 5 + 2 is two numbers beyond 5, 
which leads them to discover the efficiency of 
counting from the first addend and to rely on 
more efficient counting procedures. For addi- 
tion, the most efficient counting procedure 
involves starting with the cardinal value of the 
larger addend and counting up the number of 
times equal to the smaller addend (for 2 + 3 = 
“two: three, four, five”); for subtraction, start- 
ing with the subtrahend and counting to the 
minuend (for 7 − 4 = “four: five, six, seven”; 
the answer is the number of counts). 

Frequent use of efficient counting proce- 
dures reliably produces the correct association 
between problem and answer, which results in 
long-term memories (Fuson & Kwon, 1992; 
Siegler & Robinson, 1982). This enables direct 
retrieval of answers, and the commutativity of 
addition facilitates retrieval of related addition 
problems (Rickard, Healy, & Bourne, 1994). 
Subtraction, which is not commutative, is more 
difficult. It is facilitated by retrieval of related 
addition facts (e.g., 8 − 5 = 3, based on 5 + 3 = 
8; LeFevre & Morris, 1999) once children 
understand the inverse relation between addi- 
tion and subtraction (Geary, Boykin, Embretson, 
Reyna, Siegler, Berch, & Graban, 2008). 

Difficulty with arithmetic is an indicator of 
risk for long-term learning disabilities (Geary, 
Hoard, Nugent, & Bailey, 2012). Students with 
mathematics learning disabilities show consis- 
tent delays in the adoption of efficient counting 
procedures, make more counting errors during 
their execution, and fail to make the shift toward 
memory-based retrieval of answers (e.g., Geary 
et al., 2012; Goldman, Pellegrino, & Mertz, 
1988). Most eventually catch up to peers in 


skilled use of counting procedures, but difficulty 
with retrieval tends to persist (Geary et al., 2012; 
Geary, Hoard, Byrd-Craven, Nugent, & Numtee, 
2007; Jordan, Hanich, & Kaplan, 2003). 

The major emphasis of the intervention in 
the present analysis was developing the con- 
ceptual bases for arithmetic, as reflected in the 
developmental pathways described previ- 
ously. The five minutes of each 30-minute 
session devoted to practice was nonspeeded 
(reviewing the conceptual bases underpinning 
arithmetic problems) or speeded (promoting 
strategic, quick responding and use of effi- 
cient counting procedures to generate many 
correct responses to arithmetic problems). 

In the Fuchs et al. (2013) RCT, we investi- 
gated the efficacy of intervention on first grad- 
ers’ competence with arithmetic while 
assessing transfer to two-digit calculations and 
an integrated measure of number knowledge. 
We considered calculations a form of transfer 
because two-digit calculations were not a major 
focus of intervention. We considered the num- 
ber knowledge task transfer because it repre- 
sented an integrative form of knowledge 
across multiple dimensions of number knowl- 
edge and it was novel, not explicitly taught or 
practiced during intervention. (We omitted 
word problems from the present analysis 
despite significant word-problem effects in 
Fuchs et al. due to space constraints and 
because word problems were not a major focus 
of intervention. They were used only to con- 
textualize number sentences. Also, results par- 
alleled findings for the other transfer tasks.) 

For the present analysis, we asked two 
questions. For the first, “Does the effect 
of math intervention compared to control dif- 
fer depending on students’ pre-intervention 
math skill?,” we combined the two intervention 
conditions and estimated the effect of 
intervention versus control. For the 
second, “Does the effect between the two 
types of practice conditions differ depending 
on pre-intervention mathematics skill?,” we 
compared math intervention with speeded 
practice to math intervention with nonspeeded 
practice. For each outcome, the corresponding 
pre-intervention score was treated as the mod- 
erator of treatment effects. 
Participants 


Additional information on participants and 
other methods is available in Fuchs et al. 
(2013). Across four cohorts in four consecutive 
years, we recruited 40 schools and 233 
first-grade classes. We relied on a latent class 
approach to screen the first cohort of children 
for at-risk and not-at-risk status by combining 
scores across math applications, concepts, 
calculations, and word-reading screeners into a 
single latent factor, used to designate risk status. 
For remaining cohorts, we used the first-year 
cut-points for consistency. We excluded 
students whose teachers identified them as 
non-English speakers or who had standard scores 
below 80 on both subtests of the two-subtest 
Wechsler Abbreviated Scale of Intelligence 
(WASI; Wechsler, 1999). 

We enrolled into the study 648 at-risk 
(below the 40th percentile on the latent factor 
score) and 325 not-at-risk (above the 40th 
percentile) students from 227 classes. (The 
not-at-risk sample was used to set responsiveness 
criteria. The other six classrooms did not 
include enough at-risk students to participate.) 
Then we randomly assigned at-risk students at 
the individual level, stratifying by pre-intervention 
math scores and classrooms, to three 
conditions: control, conceptual arithmetic 
intervention with speeded practice (A+SP), 
and conceptual arithmetic intervention with 
nonspeeded practice (A+NSP). 

During first grade, some not-at-risk and at-risk 
students (distributed comparably across 
conditions) moved outside the study’s reach, 
leaving 307 not-at-risk and 608 at-risk students 
in 218 classrooms from 39 schools. Of the 608 
at-risk students, 213 were in the control group, 
199 in the A+SP condition, and 196 in the 
A+NSP condition. Sample size for this analysis 
is slightly larger than reported in Fuchs et al. 
(2013) (608 vs. 591 at-risk; 307 vs. 300 not-at-risk) 
because Fuchs et al.’s analyses were 
restricted to children who were also tested on a 
battery of cognitive process measures. 

See Table 1 for pre-intervention scores on 
the Wide Range Achievement Test-3-Arithmetic 
and WASI by condition. Control group children 
were 73.2% African American, 17.4% non-Hispanic 
White, 6.1% Hispanic White, and 3.3% Other. 
A+SP children were 67.8% African American, 
22.6% non-Hispanic White, 6.5% Hispanic 
White, and 0% Other. A+NSP children were 
67.3% African American, 20.4% non-Hispanic 
White, 8.2% Hispanic White, and 4.1% Other. 
Not-at-risk children were 39.7% African 
American, 41.7% non-Hispanic White, 8.5% 
Hispanic White, and 10.1% Other. 


Intervention 


Intervention, which addresses the conceptual 
and procedural bases for emerging compe- 
tence with arithmetic, occurred three times per 
week, 30 minutes per session, for 16 weeks in 
a quiet location outside of classrooms. Make- 
ups ensured 48 sessions. The program is orga- 
nized in a manual (Galaxy Math; Fuchs, Fuchs, 
& Bryant, 2010) with materials and guides that 
provide each lesson’s structure, content, and 
language of explanation. To ensure the natural 
flow of interactions and responsiveness to stu- 
dent difficulties, tutors review but do not read 
from or memorize lesson guides. To foster 
engagement, the program uses a space theme. 
Each lesson includes a 25-minute segment on 
the conceptual bases for arithmetic and five 
minutes of practice to support accurate arith- 
metic skill. Content and activities in the 
25-minute segment were the same in the two 
practice conditions. 

Lessons are organized in five units. Unit 1 
addresses basic number knowledge; Unit 2, 
arithmetic doubles; Unit 3, arithmetic sets 5 
through 12 (e.g., the 5 set includes all prob- 
lems with sums or minuends of 5); and Unit 4, 
10s concepts. Students who advance quickly 
through most lessons also complete Unit 5, a 
review set of lessons. Instruction incorporates 
manipulatives and number lines (1-19 through 
Unit 3; 1-100 for Unit 4). 

Table 1. Descriptive Pre-Intervention Performance and Pre- and Posttreatment Outcomes for At-Risk 
Students by Condition and Not-At-Risk Students. 

                                            At Risk
                           Control           A+SP            A+NSP        Not at Risk
Variable                  Mean     SD     Mean     SD     Mean     SD     Mean     SD
Descriptive
  WRAT                   88.81  12.05    89.15  11.67    89.68  12.76   107.29  11.95
  WASI IQ                85.63   8.07    85.76   7.81    85.46   8.45   100.37  13.34
Outcomes
  Arithmetic-pre         12.42   7.31    12.56   7.58    12.47   6.78    29.91  11.85
  Arithmetic-post        22.18  11.58    33.27  14.02    27.97  11.82    42.73  14.21
  Calculations-pre        2.66   2.71     2.86   2.85     2.80   2.83     8.64   6.59
  Calculations-post       6.08   5.89    10.28   7.78     9.18   6.76    15.45   9.72
  Number knowledge-pre   -0.51   0.72    -0.50   0.83    -0.49   0.67     1.02   1.02
  Number knowledge-post  -0.76   1.23    -0.37   1.34    -0.49   1.22     1.09   1.07

Note: A+NSP = Arithmetic plus nonspeeded practice; A+SP = Arithmetic plus speeded practice; WASI = Wechsler 
Abbreviated Scale of Intelligence; WRAT = Wide Range Achievement Test-3-Arithmetic. 


Unit 3, which comprises approximately 
half the program, focuses on partitioning 
numbers into constituent sets and number 
families (e.g., for the 5 set, 0 + 5, 1 + 4, 2 + 3, 
5 − 0, 5 − 1, 5 − 2, etc.), with five activities per 
lesson. First, tutors and students use unifix 
cubes to explore how the target number can be 
partitioned in different ways to derive the 
addition and subtraction problems in that set. 
The second activity focuses on number fami- 
lies in that set, with visual displays that group 
families in the set and blocks to help students 
rely on part-whole knowledge to understand 
why four problems make families. Third, for 
the target set, students generate all addition 
and subtraction problems, using rockets to 
show problems. Fourth, tutors and students 
create a story problem together on that set, 
produce the answer, and explain why the 
problem belongs in the set. Fifth, students 
review previous sets with corrective feedback. 
Between one and four lessons occur in each 
set; mastery criteria determine the pace with 
which students move through sets. 

The content addressed in the final 5-minute 
practice segment of each lesson was the same 
in both practice conditions: that day’s num- 
ber knowledge lesson (through Lesson 4: 
number identification, magnitudes, greater 
or less than; after Lesson 4: sets with +0, 1, 
and 2, and then sets with sums and minuends 
of 5 to 12). Practice activities differed by condition. 
Nonspeeded practice reinforces thoughtful 
application of the relations and principles 
that serve as reasoning strategies to support 
arithmetic procedural skill. Students use space 
theme manipulatives to play games that provide 
contextualized review of that day’s lesson. 
For example, for n + 1 lessons, children 
spin a dial segmented from 1 to 19 to identify 
the number of “rockets called to explore the 
math galaxy” and count this number of rock- 
ets onto the game board. Then, the tutor 
informs the child that one more rocket is 
needed or one is called back to the space sta- 
tion, so the child adds or takes one away. The 
child then states the number sentence with 
answer. For an 8 set lesson, children are 
informed how many rockets constitute the 
fleet and write that numeral as the total. Then 
they roll a die to find the first group of rockets 
released from the space station, count that 
number onto the game board, and write the 
numeral as an addend. Then, they determine 
how many more rockets are needed to com- 
plete the fleet, write that numeral as an 
addend, and read the number sentence. Next, 
they roll the die to find how many rockets are 
called back to the space station and write 
numerals to generate and read a number sen- 
tence. Games differ for each day on the same 
topic. In nonspeeded practice and lessons, 
tutors encourage children to know the answer 
or rely on number principle strategies, includ- 
ing using number lines, arithmetic principles, 
and efficient counting strategies. “Knowing 
the answer right off the bat” is the preferred 
strategy when students are sure of answers. 

Speeded practice promotes quick respond- 
ing and use of efficient counting procedures to 
generate many correct answers, with the goal 
of forming long-term memory representations 
for retrieval. Students complete the “Meet or 
Beat Your Score” game, which provides 90 
seconds to answer flashcards (e.g., for n + 1 
lessons, flashcards are n + 1 and 1 + n problems 
where n = 0-18; for 8 set lessons, flashcards 
are addition problems with the sum 8 
and subtraction problems with the minuend 
8). Each presented problem is answered cor- 
rectly because as soon as an error occurs, 
tutors require children to use the taught count- 
ing strategy to produce the correct response. 
Time elapses as children use the counting pro- 
cedure (as many times as needed). So, careful 
but quick responding increases the number of 
correct responses, which is charted on a 
Rocket Chart at the end of 90 seconds. Then, 
the child has two chances to meet or beat that 
score. In speeded practice and lessons, tutors 
require children to know the answer (retrieve) 
or use the efficient counting strategies they 
have been taught. “Knowing the answer right 
off the bat” is preferred if the child is confi- 
dent of the answer. 


Measures and Data Collection 


To index arithmetic skill, we used Arithmetic 
Combinations (Fuchs, Hamlett, & Powell, 
2003), with Addition (25 problems, sums 5 
to 12) and Subtraction (25 problems, minu- 
ends 5 to 12). For each subtest, students have 
one minute to write answers. We used total 
number of correct answers across Addition 
and Subtraction. On this sample, α was .96. 
To index transfer to more complex calcula- 
tions, we used Double-Digit Addition and 
Subtraction (Fuchs et al., 2003), with two 
subtests: Addition (20 two-digit addition prob- 
lems with and without regrouping) and Sub- 
traction (20 two-digit subtraction problems 


with and without regrouping). For each sub- 
test, students have five minutes to write 
answers. On this sample, α was .94. 

To assess transfer to an integrative number 
knowledge task, we used the Number Sets 
Test (Geary, Bailey, & Hoard, 2009), which 
indexes the speed and accuracy with which 
children understand and operate with small 
numerosities (<10) while transcoding between 
quantities and symbols. At the top of the page, 
the target sum (5 or 9) is shown. For each tar- 
get sum, dominoes on the first page contain 
arrays of objects with same or different 
objects; the second page shows objects with 
Arabic numerals and Arabic numerals with 
Arabic numerals. The child circles groups that 
combine to make the target sum, with 60 sec- 
onds per page for the sum 5 and 90 seconds 
for the sum 9. Signal detection methods are 
applied to the number of hits and false alarms 
to generate a d' variable representing the 
child’s sensitivity to quantities (Geary et al., 
2009). Test-retest reliability on a sample of 50 
participants was .89. 
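
As a rough illustration of the signal detection scoring described above, the sketch below computes d' as the z-transformed hit rate minus the z-transformed false-alarm rate. The function name, the example counts, and the log-linear correction (adding 0.5 to each count so that rates of 0 or 1 stay finite) are assumptions made for illustration; the scoring rules actually applied to the Number Sets Test are those in Geary et al. (2009) and may differ.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index: z(hit rate) - z(false-alarm rate).

    A log-linear correction (an assumption, not taken from the paper)
    keeps both rates strictly between 0 and 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one child (not data from the study).
print(d_prime(hits=18, misses=2, false_alarms=3, correct_rejections=17))
```

A child who circles correct and incorrect groups at the same rate gets a d' of zero; larger values indicate better discrimination of groups that match the target sum.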


Fidelity of Implementation and Data 
Collection 


Every intervention session was audiotaped. 
We randomly sampled 20% of recordings 
such that tutor, student, and lesson were 
sampled comparably. A research assistant 
listened to each sampled tape while com- 
pleting a checklist to identify the essential 
points the tutor implemented. Agreement 
exceeded 97%. Research assistants, unfa- 
miliar to the children they tested, adminis- 
tered measures in groups. We audiotaped 
individual test sessions and rescored 20% of 
recordings. Agreement exceeded 98%. 


Data Analysis 


For the first question, “Does the effect of number 
knowledge intervention compared to control 
differ depending on students’ 
pre-intervention skill?”, we compared the two 
intervention conditions combined against 
control. For the second question, “Does the difference 
in effect between the two types of number 
knowledge intervention differ depending on 
pre-intervention skill?”, we compared A+SP 
against A+NSP. For each outcome, the corresponding 
pre-intervention score was treated as 
the moderator of the intervention effect. To 
answer our questions, we used two orthogonal 
contrast codes. The estimate of the first, c1_TvC 
(control = −.66, A+SP = .33, and 
A+NSP = .33), represents the mean difference 
between students in intervention versus control. 
The estimate of the second, c2_A+SPvA+NSP 
(control = 0, A+SP = .5, and A+NSP = −.5), represents 
the mean difference between A+SP and 
A+NSP. 
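
The properties of these two contrast codes can be checked with a few lines of arithmetic. The sketch below is an illustration under stated assumptions, not the authors' analysis code: the helper names are hypothetical, the control code is taken to be −.66 so that the first contrast sums to zero, orthogonality is checked in the unweighted sense, and the example uses the unadjusted posttest arithmetic means from Table 1 rather than model-based estimates.

```python
GROUPS = ("control", "A+SP", "A+NSP")
C1_TVC = {"control": -0.66, "A+SP": 0.33, "A+NSP": 0.33}
C2_SP_V_NSP = {"control": 0.0, "A+SP": 0.5, "A+NSP": -0.5}


def sums_to_zero(code, tol=1e-9):
    """A contrast's weights should sum to zero across groups."""
    return abs(sum(code[g] for g in GROUPS)) < tol


def are_orthogonal(code_a, code_b, tol=1e-9):
    """Unweighted orthogonality: the cross-products sum to zero."""
    return abs(sum(code_a[g] * code_b[g] for g in GROUPS)) < tol


def implied_coefficients(means):
    """Solve mean_g = b0 + b2*c1_g + b3*c2_g for b2 and b3.

    With three groups and two contrasts the solution is exact:
    b3 is the A+SP minus A+NSP difference per unit of c2, and b2 is the
    (average intervention minus control) difference per unit of c1.
    """
    b3 = (means["A+SP"] - means["A+NSP"]) / (
        C2_SP_V_NSP["A+SP"] - C2_SP_V_NSP["A+NSP"]
    )
    intervention_mean = (means["A+SP"] + means["A+NSP"]) / 2
    b2 = (intervention_mean - means["control"]) / (
        C1_TVC["A+SP"] - C1_TVC["control"]
    )
    return b2, b3


# Unadjusted posttest arithmetic means from Table 1 (illustration only).
table1_arithmetic_post = {"control": 22.18, "A+SP": 33.27, "A+NSP": 27.97}
b2, b3 = implied_coefficients(table1_arithmetic_post)
print(sums_to_zero(C1_TVC), sums_to_zero(C2_SP_V_NSP))
print(are_orthogonal(C1_TVC, C2_SP_V_NSP))
print(round(b2, 2), round(b3, 2))
```

Because the two codes for the first contrast are .99 apart rather than exactly 1, its coefficient is the raw intervention-minus-control difference divided by .99, which is why it closely approximates, rather than exactly equals, that mean difference.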

Prior to running moderation analyses, 
outcome data were screened for nonnormal- 
ity and extreme values. One posttreatment 
arithmetic score, nearly 4 standard deviations 
(SDs) above the mean, was winsorized to the 
next closest value. Four scores on posttreatment 
number knowledge were at least 2.5 SDs 
below the mean and discrepant from the 
remainder of the distribution. These scores 
were winsorized to the next closest values. 
Then, pre-intervention comparability among 
conditions was examined with three analysis 
of variance (ANOVA) models. No significant 
differences were detected among groups for 
arithmetic, F(2, 605) = 0.02, p = .982; (square 
root of) calculations, F(2, 605) = 0.88, p = 
.415; or number knowledge, F(2, 605) = 0.05, 
p = .947. 
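
The winsorizing step described in this paragraph, in which an extreme score is pulled in to the next closest observed value, can be sketched as follows. This is a minimal illustration: the function name, the z-score flagging rule, and the example scores are assumptions, since the text does not state how extreme values were flagged beyond their distance from the mean.

```python
from statistics import mean, stdev


def winsorize_extremes(scores, z_cut):
    """Replace scores more than z_cut SDs from the mean with the
    nearest retained (non-flagged) value."""
    m, s = mean(scores), stdev(scores)
    flagged = [abs(x - m) / s > z_cut for x in scores]
    retained = [x for x, f in zip(scores, flagged) if not f]
    lowest, highest = min(retained), max(retained)
    # Clamp only the flagged scores; all other scores are left unchanged.
    return [
        min(max(x, lowest), highest) if f else x
        for x, f in zip(scores, flagged)
    ]


# Hypothetical scores with one value nearly 4 SDs above the mean.
raw = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 10, 11, 13, 12, 11, 60]
print(winsorize_extremes(raw, z_cut=3.5))
```

The flagged score takes the value of the highest retained score, so the rank order of the data is preserved while the influence of the extreme value is reduced.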

Our data structure incorporated three lev- 
els: students (Level 1), cross-classified by 
classrooms and teachers (Level 2), and class- 
rooms and teachers nested in schools (Level 3). 
For each outcome, we ran unconditional mul- 
tilevel models including a random effect for 
classrooms, teachers, and schools to judge the 
necessity of including each in the final model. 
Further, because we assumed that residual 
variance might vary by condition, we esti- 
mated separate residual variances for each. In 
the final model, we retained all nonzero ran- 
dom effects. We relied on likelihood ratio 
tests to signal the need for heteroscedastic or 
homoscedastic residuals by condition. 

For the three moderator models, pre-inter- 
vention variables were grand-mean-centered 
before generating interaction variables. Inter- 


action variables were calculated by multiplying 
the centered pre-intervention score (the same 
measure as the outcome but measured prior to 
intervention) by both contrast codes. Equation 
1 represents the final generic model: 


$$
\begin{aligned}
y_{i(j1)(j2)k} ={}& \beta_0 + \beta_1\,\mathrm{cPre} + \beta_2\,\mathrm{c1\_TvC} + \beta_3\,\mathrm{c2\_A{+}SPvA{+}NSP} \\
&+ \beta_4\,\mathrm{cPre}\times\mathrm{c1\_TvC} + \beta_5\,\mathrm{cPre}\times\mathrm{c2\_A{+}SPvA{+}NSP} \\
&+ u_{0k} + r_{0j(1)k} + r_{0j(2)k} + e_{i(j1)(j2)k}
\end{aligned}
\tag{1}
$$

where $\beta_0$ is the intercept (the average post-intervention 
score across conditions for students 
with the average pre-intervention score); 
$\beta_1$ is the effect of the pre-intervention variable; 
$\beta_2$ is the mean difference between intervention 
and control conditions, controlling for 
pre-intervention scores; $\beta_3$ is the mean difference 
between A+SP and A+NSP conditions, 
controlling for pre-intervention scores; $\beta_4$ is 
the interaction effect between pre-intervention 
scores and Contrast 1, intervention versus 
control; $\beta_5$ is the interaction effect between 
pre-intervention scores and Contrast 2, A+SP 
versus A+NSP; $u_{0k}$ is the random residual for 
school k; $r_{0j(1)k}$ is the random residual for 
classroom j(1) in school k; $r_{0j(2)k}$ is the random 
residual for teacher j(2) in school k; and 
$e_{i(j1)(j2)k}$ is the random residual for student i 
in classroom j(1) taught by teacher j(2) in 
school k. The j subscripts are assigned to both 
classrooms and teachers to signify that those 
effects are crossed at the same level. This 
model assumes homoscedasticity of Level 1 
residuals across conditions. However, separate 
variance components across conditions were 
estimated if determined necessary by likelihood 
ratio tests (Sterba, 2017). 
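
To make the construction of the moderator terms concrete, the sketch below grand-mean-centers a pre-intervention score, multiplies it by each contrast code, and evaluates only the fixed part of the model. It is a simplified illustration under stated assumptions: the function and key names are hypothetical, the control code for the first contrast is taken to be −.66, the example records and coefficients are invented, and the random effects for schools, classrooms, and teachers are omitted entirely.

```python
from statistics import mean

C1_TVC = {"control": -0.66, "A+SP": 0.33, "A+NSP": 0.33}
C2_SP_V_NSP = {"control": 0.0, "A+SP": 0.5, "A+NSP": -0.5}


def make_row(condition, c_pre):
    """Predictor values for one student, given a centered pre score."""
    c1, c2 = C1_TVC[condition], C2_SP_V_NSP[condition]
    return {
        "cPre": c_pre,
        "c1": c1,
        "c2": c2,
        "cPre_x_c1": c_pre * c1,  # moderator term for Contrast 1
        "cPre_x_c2": c_pre * c2,  # moderator term for Contrast 2
    }


def build_predictors(records):
    """records: list of (condition, pre_score) pairs."""
    grand_mean = mean(pre for _, pre in records)
    return [make_row(cond, pre - grand_mean) for cond, pre in records]


def fixed_part(row, beta):
    """Fixed-effects prediction only; random effects are left out."""
    b0, b1, b2, b3, b4, b5 = beta
    return (b0 + b1 * row["cPre"] + b2 * row["c1"] + b3 * row["c2"]
            + b4 * row["cPre_x_c1"] + b5 * row["cPre_x_c2"])


# Hypothetical records and coefficients (not estimates from the study).
records = [("control", 10.0), ("A+SP", 14.0), ("A+NSP", 12.0), ("control", 16.0)]
rows = build_predictors(records)
beta = (20.0, 1.0, 8.0, 5.0, 0.0, 0.0)
print([round(fixed_part(r, beta), 2) for r in rows])
```

When the two interaction coefficients are zero, as in this example, the predicted gap between conditions is the same at every pre-intervention score, which is the pattern of parallel lines that a nonsignificant moderation effect implies.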

Moderator models were first run using 
Stata’s quietly command in which results 
were hidden from view but residuals were 
available for inspection. Level 1 residuals 
were examined for violations of normality 
and homoscedasticity. We ran the final mod- 
els and obtained results only after making 
necessary remediations, as described in the 
Results section. 
Results 


Multilevel Multivariate Inferential 
Models 


Arithmetic. In the initial arithmetic model, three 
multivariate outliers were detected. Those 
cases were removed before estimating a final 
moderator model, in which the Level 1 residuals 
met assumptions of normality and homoscedasticity. 
Neither interaction was statistically 
significant at the α = .05 level, β4 = 0.05, 
SE = 0.10, p = .597, and β5 = 0.21, SE = 0.14, 
p = .137. In Figure 1, we graphed the nonsig- 
nificant interactions. The parallel lines illus- 
trate similar intervention effects across the 
distribution of pre-intervention scores. The 
bottom graph shows a slightly larger effect of 
A+SP over A+NSP at the upper end of the pre- 
intervention arithmetic distribution than at the 
lower end, but again this interaction (p = .137) 
was not statistically significant. 

Because neither interaction was signifi- 
cant, we removed both and reran a main 
effects model to calculate treatment ESs 
(Hedges’s gs based on model coefficients; 
U.S. Department of Education, 2013), with 
significant effects for both contrasts, β2 = 
8.54, SE = 0.72, p < .001, ES = 0.69 (mean of 
both intervention conditions vs. control), and 
β3 = 5.65, SE = 1.02, p < .001, ES = 0.44 
(A+SP vs. A+NSP). We accounted for the 
potential inflation of Type I error due to mul- 
tiple comparisons across contrasts and out- 
comes by using the Benjamini-Hochberg 
(Benjamini & Hochberg, 1995) false discov- 
ery rate to adjust critical p values. The main 
effects remained significant even with the correction. 
Table 1 shows simple pre- and post- 
intervention means and SDs by condition on 
all three outcomes. 
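
The Benjamini-Hochberg step-up procedure mentioned above can be sketched in a few lines. The example applies it to the six main-effect p values reported in this Results section, under two assumptions that the text does not confirm: that the family consists of exactly these six tests (two contrasts by three outcomes), and that each "p < .001" can be represented by its upper bound of .001. The function name is hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test is declared
    significant while controlling the false discovery rate at q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p value is at or below (rank / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Order: arithmetic C1, arithmetic C2, calculations C1, calculations C2,
# number knowledge C1, number knowledge C2 ("< .001" entered as .001).
reported_p = [0.001, 0.001, 0.001, 0.047, 0.001, 0.134]
print(benjamini_hochberg(reported_p))
```

Under these assumptions the adjusted critical value for the fifth-ranked test is (5/6)(.05), or about .042, so a p value of .047 falls short of significance even though it is below the unadjusted .05 threshold.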


Calculations. Residuals from the initial 
calculations model failed normality and 
homoscedasticity assumptions. Pre- and 
post-intervention calculations variables had 
right-skewed distributions, so the square 
root function was applied, resulting in 
more normal distributions. The model was 
rerun using the transformed variables; Level 


1 residuals from this model were normal and 
homoscedastic. Neither interaction was statistically 
significant at the α = .05 level, β4 = 
−0.04, SE = 0.11, p = .730, and β5 = −0.04, 
SE = 0.13, p = .767. In Figure 2, parallel lines 
in both graphs illustrate these nonsignificant 
interactions. 

Interaction terms were removed from the 
model to run a main effects model, with a 
significant effect for Contrast 1, β2 = 0.66, SE 
= 0.09, p < .001, ES = 0.53 (both intervention 
conditions vs. control), but not Contrast 2, β3 
= 0.22, SE = 0.11, p = .047, ES = 0.17 (A+SP 
vs. A+NSP), after adjusting critical p value 
cutoffs to control the false discovery rate 
(Benjamini & Hochberg, 1995). Thus, on the 
calculations outcome, the difference between 
intervention and control was reliable but not 
the difference between the two practice condi- 
tions. 


Number knowledge. Initial models produced 
seven multivariate outliers as detected by stan- 
dardized residuals >|3|. Even after omitting 
these values, the residuals appeared somewhat 
heteroscedastic, so we employed the Huber- 
White sandwich estimator to correct standard 
errors in the final model. As with arithmetic and 
calculations, neither interaction was statistically 
significant at the α = .05 level, β4 = 
0.01, SE = 0.11, p = .920, and β5 = −0.02, 
SE = 0.17, p = .895. Figure 3 illustrates this 
finding. The main effects model without the 
interaction terms revealed a significant main 
effect for Contrast 1, β2 = 0.33, SE = 0.08, p < 
.001, ES = 0.27 (both intervention conditions 
vs. control), but not Contrast 2, β3 = 0.14, SE 
= 0.10, p = .134, ES = 0.11 (A+SP vs. A+NSP). 


Responsiveness to Intervention 
in Terms of Growth and Post- 
Intervention Level 


Figure 1. Graphical illustration of nonsignificant interaction effects on arithmetic: Top panel is
intervention versus control conditions; bottom panel is A+SP (arithmetic concepts plus speeded
practice) versus A+NSP (arithmetic concepts plus nonspeeded practice) conditions.


We also considered the proportion of children
classified as inadequately responsive to intervention.
In these analyses, we operationalized
inadequate response in two ways: growth and
final status, both based on the normalization
principle (Frijters, Lovett, Sevcik, & Morris,
2013; Fuchs, 2003). For growth, inadequate
response was defined as improvement
(post-intervention performance minus
pre-intervention performance) below the 25th percentile
of the not-at-risk classmates’ distribution of
improvement scores. For final status, inadequate
response was defined as a post-intervention
score below the 25th percentile of the
not-at-risk classmates’ distribution of
post-intervention scores.
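
The two definitions of inadequate response given above can be sketched as follows. This is a minimal illustration rather than the study's scoring procedure: the function names, the toy scores, and the choice of linear interpolation for the 25th percentile are all assumptions.

```python
from typing import List, Sequence, Tuple


def percentile(values: Sequence[float], pct: float) -> float:
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def inadequate_response(
    at_risk_pre: Sequence[float],
    at_risk_post: Sequence[float],
    peer_pre: Sequence[float],
    peer_post: Sequence[float],
    pct: float = 25.0,
) -> Tuple[List[bool], List[bool]]:
    """Flag at-risk students as inadequate on growth and on final status."""
    # Cutoffs come from the not-at-risk classmates' distributions.
    peer_growth = [post - pre for pre, post in zip(peer_pre, peer_post)]
    growth_cut = percentile(peer_growth, pct)
    status_cut = percentile(peer_post, pct)
    growth_flags = [
        (post - pre) < growth_cut for pre, post in zip(at_risk_pre, at_risk_post)
    ]
    status_flags = [post < status_cut for post in at_risk_post]
    return growth_flags, status_flags


# Toy scores (hypothetical): five classmates and three at-risk students.
peer_pre = [10, 10, 10, 10, 10]
peer_post = [14, 16, 18, 20, 22]
growth_flags, status_flags = inadequate_response(
    [2, 4, 6], [10, 9, 17], peer_pre, peer_post
)
```

In the toy data, the first at-risk student gains 8 points, which clears the growth cutoff of 6, yet finishes at 10, below the final-status cutoff of 16. A student can therefore respond adequately on growth while remaining inadequate on final status.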


Figure 2. Graphical illustration of nonsignificant interaction effects on (square root) transfer to
calculations: Top panel is intervention versus control conditions; bottom panel is A+SP (arithmetic
concepts plus speeded practice) versus A+NSP (arithmetic concepts plus nonspeeded practice) conditions.


On arithmetic, inadequate growth occurred
for 31% of control students, 6% of A+SP students,
and 13% of A+NSP students. By contrast,
inadequate post-intervention performance
level occurred for 84% of control students,
48% of A+SP students, and 64% of A+NSP
students. On the calculations transfer outcome,
inadequate growth occurred for 45% of control
students, 22% of A+SP students, and 26%
of A+NSP students. By contrast, inadequate
post-intervention performance level occurred
for 71% of control students, 41% of A+SP
students, and 46% of A+NSP students. On the
number knowledge transfer outcome, inadequate
growth occurred for 55% of control students,
23% of A+SP students, and 27% of
A+NSP students. By contrast, inadequate
post-intervention performance level occurred for
84% of control students, 68% of A+SP students,
and 74% of A+NSP students.


Discussion 


In this discussion, we interpret main effect
results favoring intervention (across both
practice conditions) versus control on all three
outcomes and favoring speeded over nonspeeded
practice on arithmetic. Then, we consider
implications of finding that students’
pre-intervention math skill did not moderate
these main effects. Finally, we use individual
student responsiveness data to contextualize
results.

Figure 3. Graphical illustration of nonsignificant interaction effects on number sense transfer: Top
panel is intervention versus control conditions; bottom panel is A+SP (arithmetic concepts plus speeded
practice) versus A+NSP (arithmetic concepts plus nonspeeded practice) conditions.


Effects of Number Knowledge
Intervention Compared to the
At-Risk Control Group


Students who received intervention significantly
outperformed at-risk control group children on
all three outcomes. This was the case even on
this study’s most distal outcome, the Number
Sets Test (Geary et al., 2009). This task was
novel to intervention students, whereas in some
first-grade efficacy studies, number knowledge
is indexed using tasks aligned with activities
explicitly taught and practiced during intervention,
such as magnitude comparison, counting,
and ordering. With the Number Sets Test, by
contrast, children are challenged with an integrative
task in which they combine the sides of
dominos that show arrays of objects and Arabic
numerals to determine whether sums match
target numbers. The task is speeded to simultaneously
index accuracy on and fluency with
cardinality, subitizing, counting, identifying
numerals, understanding symbolic and nonsymbolic
quantity, number decomposition, and
arithmetic principles. In this study, this number
knowledge measure represented a transfer challenge.
So the ES of 0.27, favoring intervention
over at-risk control group students, is notable.


In terms of the study’s near transfer task, 
two-digit calculations with and without
regrouping, intervention introduced students to 
place value and calculation strategies, but this 
focus was limited to 6 of 48 sessions. There- 
fore, the ES favoring intervention over at-risk 
control of 0.53 was impressive and probably 
carried at least in part by intervention students’ 
superior arithmetic skill. On arithmetic, which 
was the central focus of intervention, the ES 
was large: 0.69. This is important because 
arithmetic is a core mathematical competency 
and a critical intervention target for first-grade 
children at risk for mathematics learning dis- 
abilities (e.g., Fuchs et al., 2006; Geary, 2011; 
Geary et al., 2012; U.S. Department of Educa- 
tion, 2008; Goldman et al., 1988; Jordan et al., 
2003). Across the three outcomes, the main 
effects contrasting both intervention conditions 
against the at-risk control group provide per- 
suasive evidence of the efficacy of this mathe- 
matics intervention. 


Does Speeded Practice Provide Added 
Value Over Nonspeeded Practice? 


The Fuchs et al. (2013) RCT also compared
the effects of two practice conditions, each
conducted in the context of arithmetic concepts
instruction. Nonspeeded practice
encouraged application of a variety of num- 
ber-principle strategies; speeded practice 
encouraged efficient counting strategies. On 
arithmetic, results clearly favored speeded 
practice. The advantage of speeded over non- 
speeded practice was associated with an ES 
of 0.44, and as reported in Fuchs et al. (2013), 
the estimate specifically for number knowl- 
edge intervention with speeded practice over 
the at-risk control group was 0.87 (vs. 0.51 
for number knowledge intervention with 
nonspeeded practice over control). Results 
thus indicate a substantial contribution for 
speeded strategic practice in improving 
arithmetic outcomes. 

In interpreting this finding, readers should 
note that schools sometimes provide timed 
practice without sufficient scaffolding in 
arithmetic concepts and in massed doses with- 
out support for immediate corrections of 
errors. This study’s intervention, by contrast, 
delivered speeded practice in the context of 
rich, multifaceted instruction on the concep- 
tual bases for arithmetic and formulated prac- 
tice on a distributed basis to help children 
generate many correct responses, develop flu- 
ency with efficient counting strategies, imme- 
diately correct errors, and engage in strategic 
metacognitive behavior (i.e., retrieving 
answers from memory, if confident, otherwise 
using an efficient counting strategy). There- 
fore, findings generalize only to speeded prac- 
tice that incorporates similarly sound, 
theoretically motivated instructional design. 


Does Pre-Intervention Math 
Performance Moderate 
Intervention Effects? 


A well-designed and executed RCT is the gold 
standard for validating an intervention. Yet, as 
noted in the introduction to this article, vali- 
dation does not mean all students respond, 
and little is known about which student char- 
acteristics are associated with inadequate 
response. With the present analysis, we con- 
sidered whether the severity of students’ pre- 
intervention math deficits moderates the 
effects of generally effective first-grade arithmetic
intervention.

We hypothesized that students with weaker 
pre-intervention performance profit less from 
intervention because severely discrepant ini- 
tial performance may signify a severe form of 
learning difficulty (i.e., a learning disability), 
requiring a more individualized or sustained 
(i.e., intensive) form of intervention. This 
would be revealed in a moderator effect in 
which the learning advantage for the inter- 
vention over the at-risk control group is 
weaker for students with lower pre-interven- 
tion math skill than students with higher pre- 
intervention math skill. Understanding the 
tenability of this hypothesis is important for 
gauging the robustness with which an inter- 
vention addresses the full range of at-risk 
learners and identifying interventions that do 
and do not adequately address the needs of 
at-risk students with severely low pre-inter- 
vention math skill. 

Our analyses looked at a pre-intervention 
math skill moderator effect on each out- 
come for two intervention contrasts: the 
effects of intervention (a) between students 
who receive the number knowledge inter- 
vention (aggregated across both practice 
conditions) versus at-risk control children 
and (b) between students in the two practice 
conditions. Contrary to expectations, for 
both contrasts on all three outcomes, the 
effects of intervention operated in parallel 
ways, regardless of the level of students’ 
pre-intervention math skill. 
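
What "operating in parallel ways" means for a moderation test can be sketched as follows. This is a simplified single-level illustration, not the multilevel contrast-coded models used in the study. With a dummy-coded condition, the interaction coefficient in an ordinary least squares model of post-test on condition, pretest, and their product equals the difference between the two within-condition pretest slopes, so the sketch fits one line per condition and subtracts. All names and scores are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def moderation_estimate(
    pre: Sequence[float], post: Sequence[float], treated: Sequence[bool]
) -> float:
    """Pretest-by-condition interaction: treated slope minus control slope."""
    ctrl = [(p, q) for p, q, t in zip(pre, post, treated) if not t]
    trt = [(p, q) for p, q, t in zip(pre, post, treated) if t]
    _, slope_ctrl = fit_line([p for p, _ in ctrl], [q for _, q in ctrl])
    _, slope_trt = fit_line([p for p, _ in trt], [q for _, q in trt])
    return slope_trt - slope_ctrl


# Hypothetical scores: three control students, then three treated students.
pre = [1, 2, 3, 1, 2, 3]
treated = [False, False, False, True, True, True]
# Parallel lines: treated students score 2 points higher at every pretest level.
parallel = moderation_estimate(pre, [2, 3, 4, 4, 5, 6], treated)
# Non-parallel lines: the treatment advantage grows with pretest skill.
diverging = moderation_estimate(pre, [2, 3, 4, 3, 5, 7], treated)
```

An estimate near zero, as in the first case, corresponds to the parallel lines shown in the figures: the intervention advantage is the same size at every level of pre-intervention skill.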

Thus, to the major question posed in this 
article, we conclude that the effects of first- 
grade arithmetic intervention versus the at-risk 
control group on all three outcomes are robust 
across the distribution of children’s pre-inter- 
vention mathematics performance. This is also 
the case for the superiority of speeded practice 
over nonspeeded practice on arithmetic. So,
severely low-performing students, presumed
to have the most severe risk for learning disabilities,
benefit from conceptual arithmetic intervention
comparably well, and they also enjoy
stronger outcomes on arithmetic when inter- 
vention focused on the conceptual bases for 
arithmetic is combined with speeded strategic 
practice. 


This finding corroborates Fuchs et al. 
(2016), in which a fractions intervention 
proved similarly efficacious across the con- 
tinuum of pre-intervention whole-number 
performance. This suggests that mathematics 
interventions are robust across pre-interven- 
tion mathematics skill levels, although this 
question needs to be investigated on an inter- 
vention-specific basis. For the intervention at 
hand, Galaxy Math, finding that pre-interven- 
tion mathematics skill is not an indicator of 
inadequate response to intervention still 
leaves open the search for individual differ- 
ences that are associated with responsiveness 
to intervention. Such information is needed to 
help schools spare students months of
failure in an intervention that
will eventually prove inadequate.


Rates of Individual Student 
Responsiveness: Implications for 
Intensive Intervention 


The present study’s analyses of rates of indi- 
vidual student responsiveness make clear why 
the search for individual differences associ- 
ated with responsiveness to intervention is 
important. On the study’s aligned outcome, 
arithmetic, the response rate, when indexed in 
terms of normalized growth, was strong: Only 
6% of intervention students in the speeded 
practice condition and 13% of intervention 
students in the nonspeeded practice condition 
responded inadequately. In an absolute sense, 
these rates are encouragingly low, represent- 
ing 1.5% to 3.25% of the general population. 
Even so, not all intervention children met the 
criterion for adequate improvement. 
Moreover, when using normalized post- 
intervention performance level as the criterion, 
the inadequate response rate on the arithmetic 
outcome was higher: 48% of intervention students
in the speeded practice condition completed
intervention below the 25th percentile
of not-at-risk classmates, as did 64% of intervention
students in the nonspeeded practice condition.
These rates of inadequate post-intervention 
performance among children who received the 
highly efficacious intervention are disturb- 
ingly high. 
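
The conversion behind the 1.5% to 3.25% range quoted above for the growth criterion is not spelled out in this section. The figures are consistent with multiplying each inadequate-growth rate by an at-risk share of one quarter of the general population; that share is an assumption inferred from the arithmetic, not a value stated here. A short check:

```python
# Assumed share of the general population classified as at risk.
# This value is inferred from the reported range, not stated in the text.
AT_RISK_SHARE = 0.25

# Inadequate-growth rates on arithmetic among intervention students.
inadequate_growth = {"A+SP": 0.06, "A+NSP": 0.13}

# Share of the general population that is both at risk and
# inadequately responsive on the growth criterion.
population_share = {
    condition: rate * AT_RISK_SHARE
    for condition, rate in inadequate_growth.items()
}
```

Under that assumption, the speeded condition leaves about 1.5% of the general population inadequately responsive on growth, and the nonspeeded condition about 3.25%.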

Higher rates of inadequate response when
using post-intervention performance level than
when using pre- to post-intervention improvement
have been reported elsewhere (e.g.,
Frijters et al., 2013; Fuchs et al., 2016). They
suggest that although the vast majority of
at-risk students improve nicely with highly
efficacious intervention, many complete intervention
inadequately prepared to keep pace with
classmates as new mathematics content is
introduced in classrooms. This
is not surprising given that at-risk students start
intervention performing lower than not-at-risk
classmates and that not-at-risk classmates
are enjoying a period of rapid development on
the three math outcomes during first grade.

At the same time, we caution readers that 
the criterion applied in the present study and 
commonly used in other responsiveness to 
intervention studies to determine inadequate 
post-intervention performance (scoring below 
the 25th percentile of a normative sample, 
here not-at-risk classmates) is arbitrary. Also, 
benchmarks for satisfactory post-intervention 
performance may need to differ as a function 
of grade level and the subdomain of mathe- 
matics in order to forecast long-term success 
in the general education program. 

Therefore, a line of research is needed to 
provide the field with empirical standards for 
benchmark performance that distinguish stu- 
dents who should exit intervention from those 
who require more sustained or individualized 
(i.e., intensive) intervention. Requiring students 
to meet empirically derived post-intervention 
benchmarks prior to exiting intervention may 
address intervention fade-out effects (e.g., 
Clarke, Doabler, Smolkowski, Nelson, Fien,
Baker, & Kosty, 2016; Smith et al., 2013), in 
which validated interventions improve out- 
comes relative to an at-risk control group, as 
indexed at the end of intervention, but fail to
decrease achievement gaps sufficiently to protect
at-risk intervention children from mathematics
difficulties in the long term.

We also note that in the present study, a sim- 
ilar pattern occurred on the two transfer out- 
comes, where higher rates of inadequate 
response occurred when the index was normal- 
ized post-intervention performance level than 
when normalized growth rate was employed. 
Rates of inadequate response (on post-intervention
performance level as well as on growth)
were also higher on the complex calculations 
transfer outcome than for arithmetic. Inadequate 
response rates were higher still for the more inte- 
grative, challenging form of transfer on number 
knowledge. This underscores the importance of 
establishing empirical benchmarks for long- 
term success using a variety of outcomes, not 
just those proximal to the intervention content.


Conclusions 


This analysis provides the basis for four main 
conclusions. First, efficacy for the first-grade 
mathematics intervention, when convention- 
ally framed as stronger outcomes for at-risk 
intervention students compared to at-risk con- 
trol students, is strong. Without intervention, 
students complete first grade with demonstra- 
bly and reliably worse mathematics perfor- 
mance on arithmetic, complex calculations, 
and integrative number knowledge than would 
be the case with that intervention. Second,
caution is in order, as revealed in the individual 
student response data. These analyses remind 
us that strong efficacy does not provide the 
basis for assuming all at-risk students respond. 
This indicates the importance of identifying 
reliable methods to forecast, before interven- 
tion begins, which students need to proceed 
directly to more intensive intervention. 


In the present analyses, however, students’
pre-intervention math performance did not 
provide the basis for forecasting which stu- 
dents will and will not demonstrate adequate 
response. This was also the case for a fourth- 
grade fractions intervention (Fuchs et al., 
2016). Our third conclusion, therefore, is that 
intervention efficacy researchers must con- 
tinue to examine moderator effects, consider- 
ing not only pre-intervention math skill but 
also other student-level variables theoretically 
connected to the design of interventions. 

In the absence of a reliable means for fore- 
casting which students will and will not respond 
adequately to an intervention, the focus turns to 
methods for reliably distinguishing students 
who have and have not adequately responded 
at the end of intervention. Our fourth conclu- 
sion is that research is needed to provide 
schools with technically strong post-interven- 
tion benchmarks for identifying students who 
are not adequately prepared to exit intervention 
and instead require more sustained, intensive 
services to avoid long-term failure. 

It is therefore unfortunate that few reports 
of intervention efficacy contextualize at-risk 
intervention student outcomes with respect 
to not-at-risk classmates (or with respect to 
normative frameworks from nationally 
normed tests). The goal of a multitier system 
of supports is to provide short-term interven- 
tion in a timely way to provide at-risk stu- 
dents with an academic boost that helps them 
succeed in the general education program 
without further support. Addressing the 
needs of approximately 50% of at-risk learn- 
ers with validated intervention, as in the 
present analysis (across arithmetic and trans- 
fer measures), is an important contribution 
toward this goal. Equally important is timely 
identification of the remaining students who 
require intensive (more sustained or more 
individualized) intervention. 